HIGHLIGHTS


• A multiscale top-hat selection transform based algorithm is proposed.

• The top-hat selection transform effectively differentiates and extracts the regions of interest.

• The algorithm appropriately combines the extracted useful image information.

• The algorithm produces a clear result that contains more useful image information.
To effectively combine the regions of interest in the original infrared and visual images, an adaptively
weighted infrared and visual image fusion algorithm is developed based on the multiscale top-hat selection
transform. First, the multiscale top-hat selection transform using multiscale structuring elements with
increasing sizes is discussed. Second, the image regions of the original infrared and visual images at each
scale are extracted by using the multiscale top-hat selection transform. Third, the final fusion regions are
constructed from the extracted multiscale image regions. Finally, the final fusion regions are combined
into a base image calculated from the original images to form the final fusion result. The combination
of the final fusion regions uses an adaptive weight strategy, in which the weights are obtained adaptively
based on the importance of the extracted features. In the paper, we compare seven image fusion methods:
the wavelet pyramid algorithm (WP), the shift invariant discrete wavelet transform algorithm (SIDWT),
the Laplacian pyramid algorithm (LP), the morphological pyramid algorithm (MP), the multiscale morphology
based algorithm (MSM), the center-surround top-hat transform based algorithm (CSTHT), and the proposed
multiscale top-hat selection transform based algorithm. These seven methods are compared over five different
publicly available image sets using three metrics: spatial frequency, mean gradient, and Q. The results
show that the proposed algorithm is effective and may be useful for applications related to infrared
and visual image fusion.

© 2013 Elsevier B.V. All rights reserved. 


1. Introduction 

Fusion of image information from multiple imaging sensors is
an important technique in different applications [1-4]. In particular,
fusion of images obtained from infrared and visual imaging sensors
is important for improving the performance of some military or civil
applications, such as image based guidance, security surveillance
and targeting [3,4]. Infrared (IR) images contain important image
regions which cannot be displayed by the visual image. However,
a visual image usually contains more image details than an
infrared image. So, infrared and visual image fusion is an important
way to obtain an image which contains both the important image
regions and more image details. Usually, the original infrared and
visual images should already be registered. To obtain a good fusion
result with more useful image information, different algorithms have
been proposed. The direct average algorithm is widely used in different
applications [5,6]. But this algorithm smooths some important
regions in the infrared image and image details in the visual image,
which produces an unclear result. The wavelet pyramid (WP) and
curvelet transforms decompose the original images into different
images representing the multiscale features of the original image,
and these multiscale features are used for image fusion [6-13].
But some important image details may also be smoothed, which
affects the performance of these algorithms. Although the shift
invariant discrete wavelet transform algorithm (SIDWT) [10,11],
which is an improved wavelet pyramid algorithm, performs better
for detail protection, some image details may still be smoothed and
the result image may not be clear. The Laplacian pyramid algorithm
(LP) [6,11,14,15] can also extract multiscale image features
for image fusion. Although the result image is clear because some
edge features may be strengthened, some image details may be
smoothed and the contrast of the result image may not be good.
Independent component analysis (ICA) or principal component
analysis (PCA) based algorithms extract the main information of
the original images and combine the extracted information
to form the final fusion result [16-18]. However, some detailed
information which represents important image regions may
not be preserved in the final fusion result. Neural networks are also
applied to image fusion, mainly for multi-focus
image fusion [19,20].

Mathematical morphology based algorithms are effective
for image fusion [5,21-25]. These algorithms use morphological
operations to extract the useful image features and combine these
features to form the final fusion image. Among
the morphological operations used for feature extraction, top-hat
transforms show promising results. Through the multiscale
extension of the classical top-hat transform by using the multiscale
morphological theory (MSM), image features in the original images
may be well extracted and combined into the final fusion image
[5]. However, because the opening and closing in
the classical top-hat transform may not perform well for the extraction
of the regions of interest [22], the image details in the fusion result
are not clear. To improve the performance of the top-hat transform
through constructing structuring elements, the center-surround
top-hat transform based algorithm (CSTHT) was proposed [23].
Although it is effective for obtaining a fusion result with good
contrast [23], some image details are smoothed, which may make the
final result unclear. Also, extracting the focus regions from the
multiscale features by the top-hat transform performs well for
multi-focus image fusion [25], but that algorithm may not be effective for
infrared and visual image fusion. Moreover, by utilizing
morphological operations in the pyramid decomposition theory, the
morphological pyramid algorithm (MP) can also extract
multiscale pyramid features for image fusion [21]. However,
because of the sampling, the morphological pyramid
algorithm may produce artifacts in the result image, which
affects the application of the fusion result.

The regions of interest in infrared images are usually
bright or dim regions compared with the surrounding regions.
Also, the useful image details in a visual image are image regions
which differ from the surrounding regions. These regions
of interest represent the useful and important image information
which should be combined into the final fusion image.
So, one important part of infrared and visual
image fusion is effectively extracting the regions of
interest which differ from their surroundings. The
top-hat selection transform [24], which is a modification of the
classical top-hat transform, can selectively output image regions
according to different application purposes. So, the top-hat selection
transform may be well suited to the extraction of the regions of
interest in infrared and visual images. Then, based on the top-hat
selection transform, an effective infrared and visual image fusion
algorithm may be constructed.

Through extracting the regions of interest, a multiscale top-hat
selection transform based infrared and visual image fusion
algorithm is proposed in this paper. First, specifying the top-hat
selection transform to extract the regions of interest and constructing
the multiscale top-hat selection transform using multiscale structuring
elements with increasing sizes are discussed. Second, the
multiscale image regions of the original infrared and visual images are
extracted. Third, the regions of interest represented by the final
fusion regions are constructed from the extracted multiscale image
regions. Finally, the final fusion image is obtained by importing
the extracted regions of interest into a base image using the
weight strategy. Because the specified top-hat selection transform
effectively extracts the regions of interest, the proposed
algorithm performs well for infrared and visual image fusion.
Experimental results on infrared and visual images verify
the good performance of the proposed algorithm.

2. Top-hat selection transform 

2.1. Mathematical morphology 

Mathematical morphology has been an important theory for
image processing and pattern recognition since it was proposed
[21]. Morphological operations are mainly based on set theory.
The two basic operations, dilation and erosion, use
two sets: one is the image and the other is called the structuring
element.

Let $f$ and $B$ represent the grayscale image and the structuring
element, respectively. The dilation and erosion of $f(x, y)$ by $B(u, v)$,
represented by $f \oplus B$ and $f \ominus B$, are given as follows,

\[ f \oplus B = \max_{u,v} \bigl( f(x-u, y-v) + B(u,v) \bigr), \tag{1} \]

\[ f \ominus B = \min_{u,v} \bigl( f(x+u, y+v) - B(u,v) \bigr), \tag{2} \]

where $(x, y)$ and $(u, v)$ are the pixel coordinates of $f$ and $B$, respectively.

By combining dilation and erosion, the opening and
closing of $f(x, y)$ by $B(u, v)$, represented by $f \circ B$ and $f \bullet B$, are as
follows,

\[ f \circ B = (f \ominus B) \oplus B, \tag{3} \]

\[ f \bullet B = (f \oplus B) \ominus B. \tag{4} \]

By comparing the result of opening or closing with the original
image, the white top-hat transform and black top-hat transform of
image $f$, represented by WTH and BTH, are as follows,

\[ \mathrm{WTH}(x,y) = f(x,y) - f \circ B(x,y), \tag{5} \]

\[ \mathrm{BTH}(x,y) = f \bullet B(x,y) - f(x,y). \tag{6} \]

Opening is usually used to smooth bright image regions,
whereas closing is usually used to smooth dim image regions. So,
WTH is used to extract bright image regions and BTH is used to
extract dim image regions.
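Expressions (1)-(6) can be illustrated with a minimal NumPy sketch. It assumes a flat structuring element, i.e. $B(u, v) = 0$ on its support, given as a boolean mask of odd size with its origin at the centre; pixels outside the image are ignored. The function names and the border handling are choices made for this illustration and are not taken from the paper.

```python
import numpy as np


def _neighbourhood_stack(f, se, pad_value):
    """Stack the shifted copies of f selected by the boolean mask se."""
    h, w = se.shape
    ph, pw = h // 2, w // 2
    padded = np.pad(f, ((ph, ph), (pw, pw)), mode="constant",
                    constant_values=pad_value)
    rows, cols = f.shape
    return np.stack([padded[i:i + rows, j:j + cols]
                     for i in range(h) for j in range(w) if se[i, j]])


def erode(f, se):
    """Eq. (2), flat element: min of f(x+u, y+v) over the support of B."""
    f = np.asarray(f, dtype=float)
    return _neighbourhood_stack(f, se, np.inf).min(axis=0)


def dilate(f, se):
    """Eq. (1), flat element: max of f(x-u, y-v) over the support of B."""
    f = np.asarray(f, dtype=float)
    # Reflecting the mask turns the f(x+u, y+v) indexing into f(x-u, y-v).
    return _neighbourhood_stack(f, se[::-1, ::-1], -np.inf).max(axis=0)


def opening(f, se):
    return dilate(erode(f, se), se)                      # Eq. (3)


def closing(f, se):
    return erode(dilate(f, se), se)                      # Eq. (4)


def white_top_hat(f, se):
    return np.asarray(f, dtype=float) - opening(f, se)   # Eq. (5)


def black_top_hat(f, se):
    return closing(f, se) - np.asarray(f, dtype=float)   # Eq. (6)
```

For a flat background containing a single bright pixel and a 3 x 3 square element, `white_top_hat` returns the height of the bright pixel at its position and zero elsewhere, while `black_top_hat` is zero everywhere.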


2.2. Top-hat selection transform 


Opening smooths bright image regions and closing smooths
dim image regions. So, $f(x, y) \ge f \circ B(x, y)$ and $f(x, y) \le f \bullet B(x, y)$. Take
WTH as an example: because $f(x, y) \ge f \circ B(x, y)$, WTH can be
recognized as a selective output, which can be rewritten as follows
[24],

\[ \mathrm{WTH}(x,y) = \begin{cases} f(x,y) - f \circ B(x,y), & \text{if } f(x,y) - f \circ B(x,y) \ge 0 \\ k, & \text{else,} \end{cases} \tag{7} \]

where $k$ is an arbitrary value.

Because the top-hat transform does not make good use of the difference
information between the regions of interest and the surrounding
regions, its performance is limited.
By expanding and controlling the difference information
between the original image and the result of opening or closing
following expression (7), the performance of the top-hat transform
can be improved. Based on this idea, the top-hat selection
transform was proposed [24].

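The white top-hat selection transform is given below.

\[ \mathrm{WTHS}(x,y) = \begin{cases} f(x,y) - f \circ B(x,y), & \text{if } t_1 \le f(x,y) - f \circ B(x,y) \le t_2 \\ t, & \text{else,} \end{cases} \tag{8} \]

where $t \in (-\infty, +\infty)$, $t_1 \in [0, +\infty)$ and $t_2 \in [0, +\infty)$.

Also, the black top-hat selection transform is given below.

\[ \mathrm{BTHS}(x,y) = \begin{cases} f \bullet B(x,y) - f(x,y), & \text{if } t_1 \le f \bullet B(x,y) - f(x,y) \le t_2 \\ t, & \text{else.} \end{cases} \tag{9} \]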


WTHS and BTHS indicate that, by purposely setting the values of $t$, $t_1$ and
$t_2$, the difference information between the original
image and the result of opening or closing can be well used for
different applications [24], which improves the performance of
the top-hat transform.

A flow graph of the calculation of WTHS and BTHS is shown in
Fig. 1, which shows how the top-hat selection transforms are
calculated from the basic morphological operations. The basic
morphological dilation and erosion produce the opening and
closing operations. The difference between the original image
and the result of opening or closing produces the classical top-hat
transforms. By importing $t$, $t_1$ and $t_2$ into the classical top-hat
transforms, the top-hat selection transforms are obtained.
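The selection step itself is a simple element-wise rule. The sketch below assumes the classical top-hat result (the difference between the image and its opening, or between its closing and the image) has already been computed, and that both bounds are inclusive; the function name `top_hat_selection` is hypothetical.

```python
import numpy as np


def top_hat_selection(diff, t1, t2, t):
    """Keep the top-hat difference where t1 <= diff <= t2, else output t.

    diff is f - (f opened by B) for the white transform, or
    (f closed by B) - f for the black transform.
    """
    diff = np.asarray(diff, dtype=float)
    keep = (diff >= t1) & (diff <= t2)
    return np.where(keep, diff, t)
```

With `t1 = 0`, an unbounded `t2` and any `t`, the rule reduces to the classical top-hat transform, because the difference is never negative.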


3. Algorithm 

3.1. Extracting multiscale image regions 


3.1.1. Extracting regions of interest using top-hat selection transform 
In an infrared image, the regions of interest are usually bright or
dim regions whose gray values are larger or smaller
than those of the surrounding regions. So, after the opening or closing, the
change of the gray values in the regions of interest is usually
larger than in the surrounding regions. Thus, it is easy to extract the
regions of interest by specifying the top-hat selection transform
as follows [24],

\[ \mathrm{WTHS}(x,y) = \begin{cases} f(x,y) - f \circ B(x,y), & \text{if } f(x,y) - f \circ B(x,y) \ge \mathit{nL} \\ 0, & \text{else,} \end{cases} \tag{10} \]

\[ \mathrm{BTHS}(x,y) = \begin{cases} f \bullet B(x,y) - f(x,y), & \text{if } f \bullet B(x,y) - f(x,y) \ge \mathit{nL} \\ 0, & \text{else.} \end{cases} \tag{11} \]


This definition indicates that only the regions whose gray values
are larger (for bright regions) or smaller (for dim regions) than those of
the surrounding regions by at least nL are treated as real
regions of interest. This property is useful for extracting the regions
of interest.

Also, in visual images, the regions of interest which differ
from the surrounding regions can be extracted in a similar
way. Therefore, these extracted regions of interest, which are
important image features, can be used for image fusion.

Fig. 1. Calculation of the top-hat selection transforms.

The top-hat transform is an important operation in mathematical
morphology and has been widely used in different image processing
applications [21-24]. However, because the opening or closing
operation in the top-hat transform smooths details, the top-hat
transform cannot make good use of the difference information between the
regions of interest and the surrounding background regions [21-24].
So, the performance of the top-hat transform for image processing
is not good in some cases. To make good use of this difference information,
the top-hat selection transform, which selectively outputs
the difference information between the regions of
interest and the surrounding background regions, has been given
[24]. Designing the selection rules in the top-hat selection
transform results in effective tools for different applications.

In this paper, the regions of interest usually have different gray
values compared with the surrounding background regions. So, it
is easy to discriminate the regions of interest by
importing a value into the selection rules of the top-hat selection
transform. The important image features in the original infrared
and visual images usually differ from the surrounding background
regions. These features may be well extracted by
using a value which discriminates the important features,
which is useful for infrared and visual image fusion. Therefore, the
strategy of specifying the top-hat selection transform for
infrared and visual image fusion as in expressions
(10) and (11) is simple but reasonable.

Expressions (10) and (11) require a minimum contrast in the infrared
images. The purpose of this paper is to combine the important regions
of interest in the original infrared and visual images into the final
fusion image. Because the regions of interest usually differ from
the surrounding regions, the needed contrast exists.
So, the definition of expressions (10) and (11) is reasonable.

If the contrast of the infrared image is very low, the value nL
used in expressions (10) and (11) can be set to a small value,
which will still effectively extract the needed image features.
Moreover, the experimental results, including the qualitative and
quantitative comparisons with some recent and effective
algorithms, verify the effective performance of the proposed
algorithm. Therefore, the proposed algorithm is an effective way for
infrared and visual image fusion.

For noisy images, because the noise may
differ from the surrounding regions, the proposed algorithm
may also extract the noise and combine it into the final fusion
image, which would affect the performance of the proposed
algorithm. Fortunately, the images used in the application of this paper
do not contain heavy noise, and the experimental results verify
that the proposed algorithm performs well on these images.

3.1.2. Multiscale top-hat selection transform based region extraction 

Top-hat selection transform extracts image regions with size 
corresponding to the size of the used structuring element. Usually, 
the useful image regions have different sizes and exist at different 
scales. To extract all the useful image regions for image fusion, 
multiscale structuring elements with different sizes should be used 
[24]. 

Suppose $n$ scales of structuring elements, $B_1, B_2, \ldots, B_n$, which
have the same shape and increasing sizes, are used, where

\[ B_i = \underbrace{B_1 \oplus B_1 \oplus \cdots \oplus B_1}_{\text{dilation } i \text{ times}}, \quad 1 \le i \le n. \]

Based on the multiscale structuring elements, the bright and
dim image regions in image $f$ at each scale can be extracted as
follows,

\[ \mathrm{WTHS}_i(x,y) = \begin{cases} f(x,y) - f \circ B_i(x,y), & \text{if } f(x,y) - f \circ B_i(x,y) \ge \mathit{nL} \\ 0, & \text{else,} \end{cases} \tag{12} \]

\[ \mathrm{BTHS}_i(x,y) = \begin{cases} f \bullet B_i(x,y) - f(x,y), & \text{if } f \bullet B_i(x,y) - f(x,y) \ge \mathit{nL} \\ 0, & \text{else.} \end{cases} \tag{13} \]

$\mathrm{WTHS}_i$ and $\mathrm{BTHS}_i$ are the extracted bright and dim image regions
corresponding to scale $i$ using structuring element $B_i$.
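A minimal sketch of the multiscale bright-region extraction follows. It makes three assumptions that are not stated in the paper: the structuring elements are flat boolean masks of odd size, scale 1 uses $B_1$ itself so that scale $i$ combines $i$ copies of $B_1$, and pixels outside the image are ignored. The names `grow_element` and `multiscale_wths` are hypothetical. The dim regions would follow the same pattern with closing in place of opening.

```python
import numpy as np


def _stack(f, se, pad_value):
    """Shifted copies of f over the support of the boolean mask se."""
    h, w = se.shape
    padded = np.pad(f, ((h // 2, h // 2), (w // 2, w // 2)),
                    mode="constant", constant_values=pad_value)
    rows, cols = f.shape
    return np.stack([padded[i:i + rows, j:j + cols]
                     for i in range(h) for j in range(w) if se[i, j]])


def opening(f, se):
    """Flat grayscale opening: erosion followed by dilation."""
    eroded = _stack(f, se, np.inf).min(axis=0)
    return _stack(eroded, se[::-1, ::-1], -np.inf).max(axis=0)


def grow_element(b1, i):
    """Build B_i by repeatedly dilating with B_1 (assumed: B_1 at scale 1)."""
    b = b1.copy()
    for _ in range(i - 1):
        out = np.zeros((b.shape[0] + b1.shape[0] - 1,
                        b.shape[1] + b1.shape[1] - 1), dtype=bool)
        for r, c in zip(*np.nonzero(b1)):
            out[r:r + b.shape[0], c:c + b.shape[1]] |= b
        b = out
    return b


def multiscale_wths(f, b1, n_scales, n_l):
    """Thresholded white top-hat at each scale; returns a list of arrays."""
    f = np.asarray(f, dtype=float)
    regions = []
    for i in range(1, n_scales + 1):
        diff = f - opening(f, grow_element(b1, i))
        regions.append(np.where(diff >= n_l, diff, 0.0))
    return regions
```

With a 3 x 3 square as $B_1$, a single bright pixel is already extracted at scale 1, whereas a 3 x 3 bright block only appears at scale 2, where the element has grown to 5 x 5. This matches the remark that regions of different sizes are captured at different scales.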


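3.1.3. Multiscale region extraction for image fusion

Let $f_{IR}$ and $f_{VI}$ represent the original infrared and visual images.
Based on the multiscale top-hat selection transform, the extracted
bright image regions of $f_{IR}$ at scale $i$ can be expressed as follows,

\[ [\mathrm{WTHS}_i(f_{IR})](x,y) = \begin{cases} f_{IR}(x,y) - f_{IR} \circ B_i(x,y), & \text{if } f_{IR}(x,y) - f_{IR} \circ B_i(x,y) \ge \mathit{nIR} \\ 0, & \text{else,} \end{cases} \tag{14} \]

where nIR is the parameter used by the top-hat selection transform to extract
the bright image regions in the infrared image.

Also, the bright image regions of $f_{VI}$ at scale $i$ can be expressed
as follows,

\[ [\mathrm{WTHS}_i(f_{VI})](x,y) = \begin{cases} f_{VI}(x,y) - f_{VI} \circ B_i(x,y), & \text{if } f_{VI}(x,y) - f_{VI} \circ B_i(x,y) \ge \mathit{nVI} \\ 0, & \text{else,} \end{cases} \tag{15} \]

where nVI is the parameter used by the top-hat selection transform to extract
the bright image regions in the visual image.

Similarly, the extracted dim image regions of $f_{IR}$ and $f_{VI}$ at scale $i$
can be expressed as follows,

\[ [\mathrm{BTHS}_i(f_{IR})](x,y) = \begin{cases} f_{IR} \bullet B_i(x,y) - f_{IR}(x,y), & \text{if } f_{IR} \bullet B_i(x,y) - f_{IR}(x,y) \ge \mathit{nIR} \\ 0, & \text{else,} \end{cases} \tag{16} \]

\[ [\mathrm{BTHS}_i(f_{VI})](x,y) = \begin{cases} f_{VI} \bullet B_i(x,y) - f_{VI}(x,y), & \text{if } f_{VI} \bullet B_i(x,y) - f_{VI}(x,y) \ge \mathit{nVI} \\ 0, & \text{else.} \end{cases} \tag{17} \]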


3.1.4. Specifying nIR and nVI

nIR and nVI are important parameters used in the top-hat selection
transform to extract the regions of interest. They are mainly set
based on the gray value difference between the regions of interest
and the surrounding regions.

In an infrared image, the regions of interest are bright or dim
regions, and the important regions of interest
usually differ from the surrounding regions. If the regions
of interest are very bright or dim, nIR can be a large value to
suppress most of the other background regions. This is
useful for producing a clear fusion image with useful image
information from the original images. Otherwise, nIR should be
a small value to maintain the important image regions in the
final fusion image. Correspondingly, if the regions of interest are
very bright or dim, the gray values vary widely over the image,
which results in a large standard deviation of the
gray values of the image. If the regions of interest are not
very bright or dim, the standard deviation of the gray
values of the image will be small. Therefore, nIR can be set
according to the standard deviation of the gray values of the image.
Let $\sigma_{IR}$ represent the standard deviation of the gray values
of the original infrared image. nIR is set as follows in this
paper,

\[ \mathit{nIR} = 0.4 \times \sigma_{IR}. \tag{18} \]


nIR is smaller than $\sigma_{IR}$, which will ensure that nIR is smaller than the
gray value difference between the regions of interest and the
background regions. Then, the regions of interest can be extracted by the
top-hat selection transform. Also, nIR changes with $\sigma_{IR}$,
which may ensure that all the regions of interest are extracted
and the surrounding regions are suppressed. This is useful
for constructing a clear and effective fusion result.

A visual image usually contains many image details, and the
regions of interest may not be very different from the surrounding
regions. So, nVI should be a small value, so that the important
image regions are well maintained in the result of the top-hat
selection transform. nVI is set as follows in this paper,

\[ \mathit{nVI} = 0.2 \times \sigma_{VI}, \tag{19} \]

where $\sigma_{VI}$ is the standard deviation of the gray values of the original
visual image. nVI has a similar property to nIR for region
extraction using the top-hat selection transform.
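Expressions (18) and (19) amount to two lines of code. The sketch below assumes the population standard deviation (NumPy's default, `ddof=0`), which the paper does not specify; the function name `selection_thresholds` is hypothetical.

```python
import numpy as np


def selection_thresholds(f_ir, f_vi, w_ir=0.4, w_vi=0.2):
    """Return (nIR, nVI) as weighted standard deviations of the gray values."""
    n_ir = w_ir * float(np.std(f_ir))   # Eq. (18)
    n_vi = w_vi * float(np.std(f_vi))   # Eq. (19)
    return n_ir, n_vi
```

Because both thresholds scale with the spread of the gray values, a high-contrast image automatically receives a larger threshold than a low-contrast one.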

nIR and nVI are mainly used to suppress the possible
backgrounds in the features extracted by the top-hat transform at each
scale, following the variation of the gray values in an image, so that
the regions of interest can be well extracted. Usually, because the
image features extracted by the top-hat transform are mainly useful
image regions, the backgrounds to be suppressed do not have
large gray values in the result of the top-hat transform. Thus, they
can be easily suppressed by using a small and reasonable threshold
value. nIR and nVI are defined as the standard deviation of the gray
values of the original image multiplied by a small weight, so they are usually
small values and reasonable for discriminating the regions of interest
from the surrounding backgrounds. So, using the same values of nIR
and nVI for all the scales is also effective. Experimental results
on all the images show the good performance of the proposed
algorithm using this definition, which verifies that this definition can
be used for all the images and does not need to be changed from one
image to another.

Also, using the same values of nIR and nVI for all the scales
simplifies the implementation of the proposed algorithm, which
makes the proposed algorithm more applicable.

According to the definition, if the contrast of the original image
is good, the extracted regions of interest may be very different
from the surrounding regions. In this case, the standard deviation
of the original image is large and thus leads to large
nIR and nVI, which extract the regions of interest well and
suppress the effect of other regions. Conversely, if the contrast of
the original image is low, the extracted regions of interest may
not be very different from the surrounding regions. In this case,
the standard deviation of the original image is small
and thus leads to small nIR and nVI, which also benefits the
extraction of the regions of interest in images with low contrast.
Therefore, setting the values of nIR and nVI in this way is
reasonable. So, the proposed algorithm can be used for images
with different contrasts, not only the simple cases.

Moreover, experimental results on different types of infrared 
and visual images have verified that the proposed algorithm was 
effective and performed better than some other algorithms. 

It should be pointed out that defining more appropriate
values of nIR and nVI for different scales may be one way of further
improving the performance of the proposed algorithm. This will
be addressed in our future work.

The proposed algorithm performs the fusion of infrared
and visual images by extracting the regions of interest which
differ from the surrounding regions. Because noise in an
image usually also differs from the surrounding regions, noisy
original infrared and visual images may affect the
performance of the proposed algorithm. However, the original
images usually do not contain very heavy noise. Moreover, image
smoothing techniques can be used to suppress the noise before the
image fusion. This pre-processing will improve the performance of
image fusion for images with noise.
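As one possible pre-processing step, the sketch below applies a 3 x 3 median filter before the region extraction. The paper only says that smoothing techniques could be used; the choice of a median filter, the window size, the edge replication at the border and the name `median_smooth_3x3` are all assumptions of this illustration.

```python
import numpy as np


def median_smooth_3x3(img):
    """3 x 3 median filter with edge replication at the image border."""
    img = np.asarray(img, dtype=float)
    padded = np.pad(img, 1, mode="edge")
    rows, cols = img.shape
    # Nine shifted views of the image, one per position in the 3 x 3 window.
    windows = np.stack([padded[i:i + rows, j:j + cols]
                        for i in range(3) for j in range(3)])
    return np.median(windows, axis=0)
```

An isolated noisy pixel is replaced by the median of its neighbourhood, so it no longer stands out from its surroundings and is less likely to be picked up as a region of interest. Small genuine details of one or two pixels would be removed as well, which is the usual trade-off of such smoothing.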

Therefore, because of nIR and nVI, top-hat selection transform 
could effectively extract the regions of interest and suppress the 
surrounding background regions, which will result in a clear fusion 
result image containing rich image details. 

3.2. Image fusion 

3.2.1. Constructing the final fusion regions

The extracted image regions at each scale are bright regions in
the result of the top-hat selection transform. So, the fusion regions
of the infrared and visual images at each scale should be the
pixel-wise maximum of the extracted infrared and visual regions at
that scale. The bright ($\mathrm{WTHS}_i$) and dim ($\mathrm{BTHS}_i$) fusion regions
of scale $i$ can be expressed as follows,

\[ \mathrm{WTHS}_i = \max\{\mathrm{WTHS}_i(f_{IR}), \mathrm{WTHS}_i(f_{VI})\}, \tag{20} \]

\[ \mathrm{BTHS}_i = \max\{\mathrm{BTHS}_i(f_{IR}), \mathrm{BTHS}_i(f_{VI})\}. \tag{21} \]

The extracted image regions in the result of the top-hat selection
transform have large gray values. Thus, an image region
should have larger gray values at the scale where it is extracted than at
other scales. Therefore, the final bright (RB) and dim (RD) fusion
regions of all the scales can be obtained by applying the pixel-wise
maximum operation to the extracted multiscale bright and dim
image regions, respectively, as follows,

\[ RB = \max_i \{\mathrm{WTHS}_i\}, \tag{22} \]

\[ RD = \max_i \{\mathrm{BTHS}_i\}. \tag{23} \]

These constructed final fusion regions could be used to calculate 
the final fusion image. 
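The two maximum operations in expressions (20)-(23) can be sketched as follows, assuming the per-scale regions of each source image are already available as lists of equally sized arrays. The function name `fuse_regions` is hypothetical; it would be called once with the bright regions to obtain RB and once with the dim regions to obtain RD.

```python
import numpy as np


def fuse_regions(regions_ir, regions_vi):
    """Pixel-wise maximum across the two sources, then across all scales."""
    # Eqs. (20)/(21): per-scale fusion of infrared and visual regions.
    per_scale = [np.maximum(r_ir, r_vi)
                 for r_ir, r_vi in zip(regions_ir, regions_vi)]
    # Eqs. (22)/(23): maximum over all scales.
    return np.maximum.reduce(per_scale)
```

Since the maximum is associative, taking it first across sources and then across scales gives the same result as a single maximum over all extracted regions.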

To clearly show the final fusion regions and the extracted
regions of interest at each scale, an example is shown in Figs. 2-4.
The gray values of the images in Figs. 3 and 4 have been scaled to
an appropriate interval for display. Fig. 2a and b are the
original infrared and visual images. Because there is no need to
show all the scales, we use one low scale and one high scale,
the second and sixth scales, to show the effect of
extracting image features at each scale. The extracted bright
and dim regions and the fusion regions of the second and sixth scales
from the original infrared and visual images in Fig. 2 are shown
in Fig. 3. Fig. 3a and b are the extracted bright image regions of the
second scale from the original infrared and visual images, denoted
by $\mathrm{WTHS}_2(f_{IR})$ and $\mathrm{WTHS}_2(f_{VI})$, respectively. (c) is the bright fusion
regions of the second scale, $\mathrm{WTHS}_2$. (d) and (e) are the extracted
dim image regions of the second scale from the original infrared
and visual images, denoted by $\mathrm{BTHS}_2(f_{IR})$ and $\mathrm{BTHS}_2(f_{VI})$,
respectively. (f) is the dim fusion regions of the second scale, $\mathrm{BTHS}_2$. (g)
and (h) are the extracted bright image regions of the sixth scale
from the original infrared and visual images, denoted by $\mathrm{WTHS}_6(f_{IR})$
and $\mathrm{WTHS}_6(f_{VI})$, respectively. (i) is the bright fusion regions of the
sixth scale, $\mathrm{WTHS}_6$. (j) and (k) are the extracted dim image regions
of the sixth scale from the original infrared and visual images,
denoted by $\mathrm{BTHS}_6(f_{IR})$ and $\mathrm{BTHS}_6(f_{VI})$, respectively. (l) is the dim
fusion regions of the sixth scale, $\mathrm{BTHS}_6$.

Fig. 3 shows that the extracted bright or dim image regions of
each scale are indeed the important bright or dim image regions
in the original images. These extracted regions carry
useful image information and should be combined into the final
fusion image. Fig. 3 also indicates that the extracted bright and
dim image regions at each scale have large gray values. So,
combining the extracted image regions at each scale using the
pixel-wise maximum operation is reasonable. For example, in Fig. 3a
and b, the extracted bright image regions of the second scale
from the original infrared and visual images are brighter than
other regions. Thus, after the pixel-wise maximum operation,
the bright fusion regions of the second scale combine these
extracted bright image regions well, as shown in Fig. 3c.
Moreover, the extracted image regions at the low scale
(Fig. 3a, b, d and e) usually represent the important image regions
with small size, and the extracted image regions at the high scale
(Fig. 3g, h, j and k) usually represent the important regions with
large size. So, the algorithm extracts the important image regions at
different scales well, and these regions contain the useful image
information of all the scales. Therefore, combining these extracted
image regions into the final fusion image will result in an effective
fusion image containing the extracted image regions from the original
infrared and visual images.

Fig. 3 also shows that the extracted image features (especially
the regions of interest) are different at the low and high scales.
At the second scale, which is a low scale, the extracted image
features are mainly the image details and the regions of interest with
small size. At the sixth scale, which is a high scale, the
extracted image features are mainly the regions of interest with large
size. Therefore, to extract all the image features, especially the
regions of interest, at different scales for effective infrared and
visual image fusion, multiple scales of the top-hat selection transform
should be used.

The final bright and dim fusion regions of all the scales, denoted
by RB and RD, are shown in Fig. 4a and b, respectively. Because the
fusion image regions of each scale have large gray values, the final
fusion image regions can be easily obtained by applying the
pixel-wise maximum operation to the fusion image regions of all
the scales, as shown in expressions (22) and (23). Fig. 4 shows that
the extracted bright and dim image regions from the original
infrared and visual images are well combined. These final
fusion regions are indeed the important regions in the original
infrared and visual images, which is useful for obtaining an
effective fusion image.

As noted above, only the second and sixth scales are shown, as one low
scale and one high scale, because there is no need to show all the scales.
Because the gray values of the extracted image regions at each scale
are large compared with those at other scales, the pixel-wise maximum
operation is used to combine the extracted image features at all the
scales, as shown in expressions (22) and (23).

3.2.2. Adaptive weight strategy based image fusion 

Combining these extracted image regions into a base image
which contains the basic information of the original images will
produce an effective fusion result. The pixel-wise average
of the original images can be used as such a base image,
as follows,

\[ RA(x,y) = \frac{f_{IR}(x,y) + f_{VI}(x,y)}{2}. \tag{24} \]

Fig. 3. The extracted bright and dim regions and the fusion regions of the second and sixth scales from the original infrared and visual images.

Fig. 4. The final bright and dim fusion regions of all the scales.

RB and RD are the extracted final bright and dim fusion regions,
which represent the regions of interest extracted by the multiscale
top-hat selection transform for image fusion. The important image
regions in an infrared image usually have good contrast compared
with the surrounding background regions. So, importing the
extracted bright and dim image regions into the base image through
contrast enlargement may result in a fusion image which
is clear and has a good visual effect.

Moreover, bright or dim image regions with larger gray values contain
more important image features, which should be retained more strongly
in the final result image. The mean gray values of R_B and R_D, denoted
by p_b and p_d, are therefore used as the adaptive weights in the
fusion result. Thus, the image fusion can be expressed as follows,


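$$f_F = R_A + R_B \times p_b - R_D \times p_d, \qquad (25)$$

where

$$p_b = \underset{x,y}{\operatorname{mean}}(R_B), \qquad (26)$$

$$p_d = \underset{x,y}{\operatorname{mean}}(R_D). \qquad (27)$$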


In f_F, the final bright fusion regions are added to the base image and
the final dim fusion regions are subtracted from it. Using only the
multiscale technique in the region extraction procedure may lead to a
loss of contrast information in the fused image. However, adding the
final bright fusion regions to the base image and subtracting the final
dim fusion regions from it enhances the contrast of the image, so the
contrast of the fusion image should be good. In addition, the weights
p_b and p_d ensure that whichever of the bright and dim fusion regions
contains more image information is adaptively retained more strongly in
the final result image, which further enhances the contrast of the
final fusion result. Therefore, the final fusion result is effective
and its image details are clear.
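
The adaptive weight strategy of expressions (24)-(27) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `fuse_adaptive` and the toy arrays are hypothetical, all inputs are assumed to be equally sized grayscale arrays, and any rescaling or clipping of the output gray range is left out because the text does not specify it.

```python
import numpy as np

def fuse_adaptive(f_ir, f_vi, r_b, r_d):
    """Adaptive-weight fusion following expressions (24)-(27)."""
    f_ir = np.asarray(f_ir, dtype=np.float64)
    f_vi = np.asarray(f_vi, dtype=np.float64)
    r_b = np.asarray(r_b, dtype=np.float64)
    r_d = np.asarray(r_d, dtype=np.float64)
    r_a = (f_ir + f_vi) / 2.0            # base image: pixel-wise average
    p_b = r_b.mean()                      # weight of the bright fusion regions
    p_d = r_d.mean()                      # weight of the dim fusion regions
    return r_a + r_b * p_b - r_d * p_d    # add bright, subtract dim

# Toy inputs (hypothetical values, chosen only to show the arithmetic).
f_ir = np.array([[0.0, 2.0], [4.0, 6.0]])
f_vi = np.array([[2.0, 2.0], [2.0, 2.0]])
r_b = np.array([[0.0, 4.0], [0.0, 0.0]])
r_d = np.array([[2.0, 0.0], [0.0, 0.0]])
fused = fuse_adaptive(f_ir, f_vi, r_b, r_d)
```

Because the weights are global means, a set of regions with larger overall gray values is amplified more, which matches the intent of retaining the more informative regions more strongly.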

3.3. Structuring element selection

The structuring element is an important parameter of any algorithm
based on mathematical morphology, and both its shape and its size have
to be determined.

The widely used shapes are the rectangle, square, rhombus and circle.
Because the circle has no sharp corners, which helps to suppress the
block effect, it has been widely used in different applications. In
this paper, the structuring element is therefore a circle. Other
shapes, such as the rectangle, square and rhombus, could also be used
because they are easy to implement. The experimental results also
verified that the circle was effective for all the image sets used in
this paper.

In this paper, the size of the structuring element is determined by the
scale number. A large scale number extracts image regions at more
scales using larger structuring elements, which may improve the
performance of the proposed algorithm. Usually, however, there is no
need to use a very large scale number. We have tested the proposed
algorithm on image sets from different applications, and the results
show that a scale number of 6 or 7 is sufficient for all the images.
So, in this paper, n = 6. The radius of the circular structuring
element at each scale equals the index of that scale.
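
As an illustration of this choice, the following sketch builds the n = 6 circular structuring elements with radii 1 to 6. The helper name `disk` and the discretization rule (a pixel belongs to the element when its squared distance from the center does not exceed the squared radius) are assumptions; the paper does not state how the circle is rasterized.

```python
import numpy as np

def disk(radius):
    """Boolean mask of a discrete disk with the given integer radius."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return (x * x + y * y) <= radius * radius

N_SCALES = 6  # scale number n used in the paper
structuring_elements = [disk(r) for r in range(1, N_SCALES + 1)]
```

Each mask has side length 2r + 1, so the largest element at scale 6 is a 13 x 13 mask.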

3.4. Implementation 

The implementation of the proposed algorithm is illustrated in Fig. 5.
First, the multiscale bright and dim image regions are extracted from
the original infrared and visual images, respectively, by using the
multiscale top-hat selection transform. Second, the bright and dim
fusion regions of each scale are calculated through the pixel-wise
maximum operation. Third, the final bright and dim fusion regions of
all the scales are produced from the bright and dim fusion regions of
each scale. Finally, the final fusion result is obtained by importing
the final bright and dim fusion regions into the base image using the
adaptive weight strategy.
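
The four steps above can be sketched end to end as follows. Important assumption: the top-hat selection transform is defined earlier in the paper and is not reproduced here, so this sketch substitutes the classical white and black top-hat transforms (image minus opening, closing minus image) as stand-ins for the bright and dim region extraction. The helper names, the edge-replicating border handling and the NumPy-only morphology are likewise illustrative choices, not the authors' implementation.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def disk(radius):
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return (x * x + y * y) <= radius * radius

def _filter(img, footprint, reducer):
    """Flat grayscale erosion (np.min) or dilation (np.max), edge-padded."""
    pad = footprint.shape[0] // 2
    windows = sliding_window_view(np.pad(img, pad, mode="edge"), footprint.shape)
    return reducer(windows[..., footprint], axis=-1)

def opening(img, fp):
    return _filter(_filter(img, fp, np.min), fp, np.max)

def closing(img, fp):
    return _filter(_filter(img, fp, np.max), fp, np.min)

def fuse(f_ir, f_vi, n_scales=6):
    f_ir = np.asarray(f_ir, dtype=np.float64)
    f_vi = np.asarray(f_vi, dtype=np.float64)
    r_b = np.zeros_like(f_ir)  # final bright fusion regions
    r_d = np.zeros_like(f_ir)  # final dim fusion regions
    for radius in range(1, n_scales + 1):
        fp = disk(radius)
        # Step 1: bright and dim regions of each image at this scale
        # (classical top-hats used as stand-ins, see the note above).
        bright_ir, bright_vi = f_ir - opening(f_ir, fp), f_vi - opening(f_vi, fp)
        dim_ir, dim_vi = closing(f_ir, fp) - f_ir, closing(f_vi, fp) - f_vi
        # Step 2: fusion regions of this scale by pixel-wise maximum.
        bright = np.maximum(bright_ir, bright_vi)
        dim = np.maximum(dim_ir, dim_vi)
        # Step 3: running pixel-wise maximum over all scales.
        r_b = np.maximum(r_b, bright)
        r_d = np.maximum(r_d, dim)
    # Step 4: import the regions into the base image with adaptive weights.
    r_a = (f_ir + f_vi) / 2.0
    return r_a + r_b * r_b.mean() - r_d * r_d.mean()
```

The sliding-window filter materializes every neighborhood, which is simple but memory hungry; for full-size images a dedicated morphology library would be the more practical choice.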

4. Experimental results 

Five image sets have been used in the experiments to verify the
performance of the proposed algorithm. For comparison, the widely used
wavelet pyramid algorithm (WP) [7,8], shift invariant discrete wavelet
transform based algorithm (SIDWT) [10], Laplacian pyramid algorithm
(LP) [14], morphological pyramid algorithm (MP) [21], multiscale
morphology based algorithm (MSM) [5], center-surround top-hat transform
based algorithm (CSTHT) [23] and the proposed algorithm have been
applied to these image sets. WP, SIDWT and LP are effective and widely
used algorithms based on multiscale techniques. MSM and CSTHT are
multiscale morphology based algorithms built on specially designed
morphological operators. MP is a multiscale algorithm that combines the
advantages of the pyramid technique and morphological operators. The
proposed algorithm is also a multiscale morphological algorithm and is
intended to be an effective image fusion algorithm. Therefore, WP,
SIDWT, LP, MP, MSM and CSTHT are used for the comparison in this paper.

4.1. Qualitative comparison experiment 

Figs. 6-9 list some comparison examples on images from the five image
sets. In these figures, (a) and (b) are the original infrared and
visual images, (c) is the fusion result of the wavelet pyramid
algorithm (WP), (d) is the fusion result of the shift invariant
discrete wavelet transform based algorithm (SIDWT), (e) is the fusion
result of the Laplacian pyramid algorithm (LP), (f) is the fusion
result of the morphological pyramid algorithm (MP), (g) is the fusion
result of the multiscale morphology based algorithm (MSM), (h) is the
fusion result of the center-surround top-hat transform based algorithm
(CSTHT), and (i) is the fusion result of the proposed algorithm.

Fig. 6 is a comparison example on images from the “OctecWS” image set.
In the result of WP (Fig. 6c), the people target regions cannot be
easily recognized and the difference between the bright and dim regions
is small, so the contrast is not good. The image details in this result
are also not clear, which makes the image unclear. MP produces many
artifacts (Fig. 6f), which destroy some useful information of the
original images. The contrast of the result of CSTHT (Fig. 6h) is good,
but the image is not clear because some image details are smoothed.
Compared with LP, SIDWT and the proposed algorithm, MSM still smooths
some image details (Fig. 6g), which produces an unclear result image.
The image details in the results of SIDWT (Fig. 6d), LP (Fig. 6e) and
the proposed algorithm (Fig. 6i) are rich, and the difference between
the bright and dim image regions is large in these result images. These
three algorithms therefore not only make the image details clear but
also produce fusion results with good contrast.

The proposed algorithm produces some artifacts in the text areas. The
reason is that the text areas are extracted as regions of interest in
the original visual image, and thus they are maintained in the final
fusion image. The text regions are not important information in the
original images, so these artifacts should not affect further image
analysis. From another point of view, because the text areas were added
to the image manually, they differ from the surrounding regions. The
fact that the proposed algorithm extracts them effectively therefore
also indicates that it performs well for infrared and visual image
fusion through extracting the regions of interest. In Fig. 6, the
result of the Laplacian pyramid algorithm also introduces some
artifacts around the tree area, which may affect the analysis of the
fusion image. Compared with the other algorithms, the proposed
algorithm maintains the image details well and does not introduce many
artifacts except in the unimportant text areas. Therefore, the proposed
algorithm performs effectively.

Fig. 7 is a comparison example on images from the “UNcamp” image set.
The results show that all the algorithms can combine the useful image
information of the original images and achieve the purpose of image
fusion. Because some image details are smoothed and the difference
between the bright and dim regions is small, the result of WP (Fig. 7c)
is not clear and its contrast is not good. The contrast of CSTHT
(Fig. 7h) is good, but the result is not clear because some image
details are smoothed. The results of LP (Fig. 7e) and MP (Fig. 7f) are
clearer than that of CSTHT and their contrast is good, but the result
images are still not clear. SIDWT (Fig. 7d) and MSM (Fig. 7g) smooth
some image details, especially in the road regions. Because the regions
of interest are well extracted and combined into the final fusion
result, the result of the proposed algorithm (Fig. 7i) is clearer than
those of the other algorithms. Although the contrast of the people
target region may be a little lower than in the other results, the
contrast is still good and the image details of the whole image are
very clear. For example, the fence is clearer than in the other results
because the regions of interest are well combined in the final fusion
image. Overall, the proposed algorithm performs effectively.

Fig. 5. Implementation of the proposed algorithm.

Fig. 6. An example on OctecWS images: (a) original infrared image; (b) original visual image; (c) result of WP; (d) result of SIDWT; (e) result of LP; (f) result of MP; (g) result of MSM; (h) result of CSTHT; (i) result of the proposed algorithm.

Fig. 7. An example on UNcamp images.

Fig. 8 is a comparison example on images from the “Navi” image set. The
original infrared image gives some important information, such as the
road regions, and the original visual image contains many visual image
details, such as the tree regions and some other regions. The result
images show that all the algorithms can combine these useful image
regions. Because some image details are smoothed by WP and the
difference between the bright and dim regions is small, the result of
WP (Fig. 8c) is not clear and its contrast is not good. The contrast of
the results of the remaining algorithms is good, because the difference
between the bright and dim regions in these result images is large.
However, CSTHT (Fig. 8h) and MSM (Fig. 8g) smooth many image details,
which results in unclear images. MP (Fig. 8f) produces many artifacts,
which will affect the application of the fusion image. LP (Fig. 8e) and
SIDWT (Fig. 8d) perform better than these algorithms. Compared with
them, however, the result of the proposed algorithm (Fig. 8i) also has
good contrast and contains more useful regions of interest, because the
proposed algorithm can extract the regions of interest well through the
multiscale top-hat selection transform.

Fig. 9 is a comparison example on images from the “Trees” image set.
The original images contain many tree details, and the “people” region
is the important region of interest. The result of WP (Fig. 9c) is not
clear because some image details are smoothed. MP produces many
artifacts in the result image (Fig. 9f). MSM smooths some image details
(Fig. 9g). Although the result of CSTHT (Fig. 9h) has good contrast, it
is not clear because some image details are smoothed. The results of
SIDWT (Fig. 9d) and LP (Fig. 9e) are better than that of MSM but are
less clear than the result of the proposed algorithm. Because the
proposed algorithm can extract the regions of interest well, its result
(Fig. 9i) is clear and the important image regions, such as the
“people” region, are well maintained. So, the proposed algorithm
performs better than some of the other algorithms.

In Figs. 8 and 9, because one of the original images is dark, some
algorithms retain more information from the dark image in the final
fusion image. Although the contrast of these fusion results looks good,
the information of the other original image is not well preserved, so
the image details are not clear in these results. The proposed
algorithm combines the useful information of all the original images
well, which results in a clear and effective fusion result.

To give an overall qualitative comparison, Table 1 summarizes some
aspects of the performance of these algorithms, including detailed
features, clarity, contrast and artifact reduction. These ratings are
based on overall quality comparisons on all the image sets performed by
the researchers in our laboratory. “Some” means that the fusion result
contains some of the corresponding features but the performance is not
good. “Acceptable” means that the fusion result is acceptable with
respect to the corresponding features and the performance is better
than “Some”. “Good” means that the performance is the best among these
results.

Table 1 shows that the clarity of SIDWT, LP and the proposed algorithm
is better than that of the other algorithms, and that their contrast is
good. SIDWT, LP and the proposed algorithm also do not produce many
artifacts. So, SIDWT, LP and the proposed algorithm perform better than
the other algorithms. Moreover, in the visual results in Figs. 6-9, the
results of the proposed algorithm contain the richest details among
these algorithms. Thus, the proposed algorithm performs well for
infrared and visual image fusion, and its performance is better than
that of the other algorithms.

All of these experimental results show that, because the proposed
algorithm can extract the regions of interest of the original images
well by using the multiscale top-hat selection transform, its fusion
result is effective and clear. The result image can then be used in
different applications, such as target detection, object recognition,
and image navigation.

4.2. Quantitative comparison experiment 

An effective infrared and visual image fusion algorithm should extract
the important regions of interest well and combine them into the final
fusion result image. Thus, the final result image should be clear and
have good contrast. Moreover, for the purpose of fusion, the fusion
result image should effectively import the information of the original
infrared and visual images, and it should contain rich image details.

Fig. 8. An example on Navi images: (a) original infrared image; (b) original visual image; (c) result of WP; (d) result of SIDWT; (e) result of LP; (f) result of MP; (g) result of MSM; (h) result of CSTHT; (i) result of the proposed algorithm.

To evaluate the quantitative performance of the proposed algorithm and
carry out the comparison, three measures are used in this paper:
spatial frequency (SF) [26], mean gradient (MG) [27] and an image
quality based measure (Q) [28]. Other measures could also be used for
the performance evaluation [6,11,27-31].

Suppose the fusion image f_F has size M x N. The calculation of SF is
as follows,

$$SF = \sqrt{RF^2 + CF^2}, \qquad (28)$$

where

$$RF = \sqrt{\frac{1}{M \times N} \sum_{x=1}^{M} \sum_{y=2}^{N} \left[f_F(x,y) - f_F(x,y-1)\right]^2}, \qquad (29)$$

$$CF = \sqrt{\frac{1}{M \times N} \sum_{x=2}^{M} \sum_{y=1}^{N} \left[f_F(x,y) - f_F(x-1,y)\right]^2}. \qquad (30)$$

The calculation of MG is as follows,

$$MG = \frac{1}{(M-1) \times (N-1)} \sum_{x=1}^{M-1} \sum_{y=1}^{N-1} \sqrt{\frac{\left(f_F(x,y) - f_F(x-1,y)\right)^2 + \left(f_F(x,y) - f_F(x,y-1)\right)^2}{2}}. \qquad (31)$$

Suppose the fusion image is f_F, and the corresponding original images
are f_IR and f_VI. The calculation of Q is as follows,

$$Q(f_{IR}, f_{VI}, f_F) = \frac{1}{|W|} \sum_{w \in W} \left(\lambda(w)\, Q_0(f_{IR}, f_F \mid w) + (1 - \lambda(w))\, Q_0(f_{VI}, f_F \mid w)\right), \qquad (32)$$

where

$$\lambda(w) = \frac{s(f_{IR} \mid w)}{s(f_{IR} \mid w) + s(f_{VI} \mid w)}. \qquad (33)$$

$Q_0(f_{IR}, f_F \mid w)$ is the result computed from the pixel values
of f_IR and f_F in the small window w. Details about the selection of w
are given in [28]. W is the family of all windows w. $s(f_{IR} \mid w)$
and $s(f_{VI} \mid w)$ can be taken as the variances of the pixels of
f_IR and f_VI in w, respectively. Suppose a and b are the sequences of
the pixel values of f_IR and f_F in w, then

$$Q_0(f_{IR}, f_F \mid w) = \frac{\sigma_{ab}}{\sigma_a \sigma_b} \times \frac{2\bar{a}\bar{b}}{\bar{a}^2 + \bar{b}^2} \times \frac{2\sigma_a \sigma_b}{\sigma_a^2 + \sigma_b^2}, \qquad (34)$$

where $\bar{a}$ and $\bar{b}$ are the mean values of the pixel values
in a and b, respectively, $\sigma_a^2$ and $\sigma_b^2$ are the
variances of the pixel values in a and b, respectively, and
$\sigma_{ab}$ is the covariance of the pixel values in a and b.
$Q_0(f_{VI}, f_F \mid w)$ is calculated in the same way.

Fig. 9. An example on Trees images: (a) original infrared image; (b) original visual image; (c) result of WP; (d) result of SIDWT; (e) result of LP; (f) result of MP; (g) result of MSM; (h) result of CSTHT; (i) result of the proposed algorithm.

Table 1
Overall qualitative comparison.

                    WP    SIDWT       LP          MP    MSM         CSTHT       The proposed algorithm
Detailed features   Some  Acceptable  Acceptable  Some  Acceptable  Some        Good
Clarity             Some  Good        Good        Some  Acceptable  Acceptable  Good
Contrast            Some  Good        Good        Good  Good        Good        Good
Artifact reduction  Good  Good        Good        Some  Good        Good        Good

SF is a widely used measure in image fusion [26]. It is defined in
terms of the spatial image details and contrast contained in the image.
Therefore, SF is an appropriate measure for the quantitative comparison
of the performance of different algorithms.
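
A minimal NumPy sketch of SF, following expressions (28)-(30), is given below. The function name `spatial_frequency` is hypothetical, and the sketch assumes a single-channel image with row differences taken between horizontally adjacent pixels and column differences between vertically adjacent pixels.

```python
import numpy as np

def spatial_frequency(img):
    """Spatial frequency SF = sqrt(RF^2 + CF^2) of a grayscale image."""
    f = np.asarray(img, dtype=np.float64)
    m, n = f.shape
    # Row frequency: differences between horizontally adjacent pixels.
    rf = np.sqrt(np.sum((f[:, 1:] - f[:, :-1]) ** 2) / (m * n))
    # Column frequency: differences between vertically adjacent pixels.
    cf = np.sqrt(np.sum((f[1:, :] - f[:-1, :]) ** 2) / (m * n))
    return np.sqrt(rf ** 2 + cf ** 2)
```

A constant image gives SF = 0, and sharper transitions between neighboring pixels increase the value, which is why a larger SF is read as more detail and contrast.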

Moreover, a single measure may not be enough for performance
comparison, because different measures may give different evaluation
results [6,11,27-31]. To compare the image fusion algorithms with more
measures, the mean gradient (MG) [27] and the image quality based
measure (Q) [28] are also used in this paper. MG is a gradient based
measure that has been widely used in image fusion for measuring the
clarity and contrast of an image [27]. Q is defined in terms of image
quality and measures how well an algorithm imports the information of
the original infrared and visual images into the final fusion image.
Large values of SF, MG and Q indicate good performance of the
corresponding algorithm for infrared and visual image fusion.
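
The two additional measures can be sketched as follows, under several assumptions that the text leaves open: MG uses forward differences on the (M-1) x (N-1) grid, the windows of Q are all 8 x 8 windows with a stride of one pixel (the actual window selection is described in [28]), flat windows receive a weight of 0.5, and a small constant `eps` guards against division by zero. All function names are hypothetical, and the images are assumed to be at least as large as the window.

```python
import numpy as np

def mean_gradient(img):
    """Mean gradient over the (M-1) x (N-1) interior grid."""
    f = np.asarray(img, dtype=np.float64)
    dx = f[1:, :-1] - f[:-1, :-1]
    dy = f[:-1, 1:] - f[:-1, :-1]
    return np.mean(np.sqrt((dx ** 2 + dy ** 2) / 2.0))

def q0(a, b, eps=1e-12):
    """Local quality index of two pixel sequences (product form of Eq. (34))."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    mean_a, mean_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov_ab = np.mean((a - mean_a) * (b - mean_b))
    # The three factors of Eq. (34) multiply out to this single fraction.
    return 4.0 * cov_ab * mean_a * mean_b / (
        (var_a + var_b) * (mean_a ** 2 + mean_b ** 2) + eps)

def fusion_q(f_ir, f_vi, f_f, win=8):
    """Window-averaged fusion quality Q following Eqs. (32) and (33)."""
    f_ir = np.asarray(f_ir, dtype=np.float64)
    f_vi = np.asarray(f_vi, dtype=np.float64)
    f_f = np.asarray(f_f, dtype=np.float64)
    h, w = f_f.shape
    total, count = 0.0, 0
    for i in range(h - win + 1):
        for j in range(w - win + 1):
            sl = (slice(i, i + win), slice(j, j + win))
            s_ir, s_vi = f_ir[sl].var(), f_vi[sl].var()
            lam = s_ir / (s_ir + s_vi) if (s_ir + s_vi) > 0 else 0.5
            total += lam * q0(f_ir[sl], f_f[sl]) + (1.0 - lam) * q0(f_vi[sl], f_f[sl])
            count += 1
    return total / count
```

When the fused image is identical to both inputs, every local index is 1 and Q approaches 1, which is the upper end of the measure.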

Images from the five image sets are processed by the proposed algorithm
and the comparison algorithms. The final fusion results produced by
each algorithm are used to calculate the values of the quantitative
measures. For each measure and each algorithm, the mean value over all
the used images is reported. Figs. 10-12 show these quantitative
comparisons for the measures SF, MG and Q, respectively. The image sets
used in this paper to demonstrate the performance of the proposed
algorithm are public data sets that do not contain much noise. Details
of the comparison are given below.

Fig. 10 is the quantitative comparison using the measure SF. The
proposed algorithm gives a larger value than the other algorithms,
which means that its fusion results contain more image details than
those of the other algorithms and that their contrast is good. This is
because the proposed algorithm can extract the regions of interest in
the original infrared and visual images well and effectively combine
them into the final fusion image. So, with respect to SF, the proposed
algorithm, with its emphasis on extracting regions of interest,
performs better for infrared and visual image fusion.

Fig. 11 is the quantitative comparison using the measure MG. The MG
value of the proposed algorithm is larger than those of the other
algorithms. This means that the fusion result image of the proposed
algorithm is clear and its contrast is good, because the regions of
interest in the original infrared and visual images are well extracted
and used for image fusion. So, with respect to MG, the fusion result of
the proposed algorithm is better than those of the other algorithms.

Fig. 10. Quantitative comparison using the measure SF (algorithms on the horizontal axis: WP, SIDWT, LP, MP, MSM, CSTHT, the proposed algorithm).

Fig. 11. Quantitative comparison using the measure MG (algorithms on the horizontal axis: WP, SIDWT, LP, MP, MSM, CSTHT, the proposed algorithm).

Fig. 12. Quantitative comparison using the measure Q (algorithms on the horizontal axis: WP, SIDWT, LP, MP, MSM, CSTHT, the proposed algorithm).



Fig. 12 is the quantitative comparison using the measure Q. Although
the Q value of the proposed algorithm is not the largest one, the
differences between it and the larger Q values (SIDWT, LP) are very
small. This means that the proposed algorithm also performs well in
importing the information of the original infrared and visual images
into the final fusion image.

Moreover, the SF and MG values of the proposed algorithm are larger
than those of the other algorithms. Therefore, the proposed algorithm
can extract and combine the regions of interest in the original images
well to achieve a clear fusion image with good contrast, and the
original image information is well maintained in its fusion result
image.

Table 2
Calculation time comparison of different algorithms (s).

WP     SIDWT  LP     MP     MSM    CSTHT   The proposed algorithm
0.574  0.733  0.082  1.311  0.923  25.450  3.874

To compare the calculation time of the different algorithms, the
algorithms are run on infrared and visual images of size 360 x 270
(CPU: Intel Pentium 4, 2.6 GHz; memory: 512 MB). Each algorithm is run
several times, and its mean time is used for the comparison. The
calculation time of each algorithm is shown in Table 2.
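
The timing protocol described above (several runs, mean time reported) can be sketched with the standard library. The helper name `mean_runtime` and the default of five repetitions are assumptions; the paper does not state how many runs were used.

```python
import time

def mean_runtime(func, *args, repeats=5):
    """Mean wall-clock time in seconds of func(*args) over several runs."""
    elapsed = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        elapsed.append(time.perf_counter() - start)
    return sum(elapsed) / len(elapsed)
```

A call such as `mean_runtime(fusion_function, f_ir, f_vi)`, with `fusion_function` standing for any of the compared algorithms, would give a figure comparable in spirit to one entry of Table 2, although absolute times depend on the hardware.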

Table 2 shows that the proposed algorithm is not very fast. Because the
morphological operations have to be performed several times under the
multiscale morphological theory, the algorithms based on this theory
(CSTHT, MP, MSM and the proposed algorithm) take more time than the
other algorithms (WP, LP and SIDWT). Because the top-hat selection
transform in the proposed algorithm is faster to calculate than the
center-surround top-hat transform in CSTHT, the proposed algorithm is
much faster than CSTHT. However, because the top-hat selection
transform is a modified version of the classical top-hat transform, and
the comparison operation in the top-hat selection transform takes some
calculation time, the calculation time of the proposed algorithm is
longer than that of MSM. Nevertheless, several methods [32,33] that
have been widely used to speed up morphological operations for
real-time applications could be used to decrease the calculation time
of the proposed algorithm. Moreover, the proposed algorithm performs
well on different types of infrared and visual images, and its
performance is better than that of some other algorithms. Therefore,
the proposed algorithm is effective and useful for applications related
to infrared and visual image fusion, such as target detection, object
recognition, and image navigation.

To summarize the quantitative comparisons shown in Figs. 10-12 and
Table 2, an overall comparison of these algorithms is given in Table 3.
In Table 3, “Some” means that the corresponding quantitative value is
worse than that of most of the other algorithms. “Acceptable” means
that the corresponding quantitative value is not very different from
the best one. “Good” means that the corresponding quantitative value is
the best among these algorithms. Table 3 shows that, although the Q
value and the calculation time of the proposed algorithm are not the
best, they are acceptable and not very different from the best ones.
More importantly, the spatial frequency and mean gradient of the
proposed algorithm are the best among these algorithms. Overall,
therefore, Table 3 further verifies the effective performance of the
proposed algorithm for infrared and visual image fusion.

Table 3
Overall comparison based on the quantitative comparisons.

                            WP    SIDWT  LP          MP          MSM         CSTHT       The proposed algorithm
Spatial frequency (Fig. 10) Some  Some   Acceptable  Acceptable  Some        Some        Good
Mean gradient (Fig. 11)     Some  Some   Some        Acceptable  Some        Some        Good
Q value (Fig. 12)           Some  Good   Acceptable  Some        Acceptable  Acceptable  Acceptable
Calculation time (Table 2)  Good  Good   Good        Acceptable  Good        Some        Acceptable

These experimental results verify the effective performance of the
proposed algorithm for infrared and visual image fusion, with emphasis
on extracting the regions of interest, on different images from
different applications. So, the proposed algorithm may be well suited
to applications related to infrared and visual image fusion.

5. Summary and conclusions 

An infrared image contains important regions of interest, while a
visual image shows many useful image details, which are themselves
important regions of interest. Image fusion is an effective technique
for combining the useful information of the original infrared and
visual images. For effective infrared and visual image fusion, a
multiscale top-hat selection transform based algorithm with emphasis on
extracting the regions of interest is proposed in this paper.

The top-hat selection transform differentiates the regions of interest
from the background regions well and thereby extracts the regions of
interest in the infrared and visual images. Thus, through combining the
extracted useful image information, the proposed algorithm can produce
a clear result that contains more useful image information. Because of
the effective extraction of the regions of interest and the reasonable
combination of the extracted image regions, the proposed algorithm
performs well for infrared and visual image fusion. Moreover, the
automatic selection of the parameters in the top-hat selection
transform and of the fusion weights improves the adaptability of the
proposed algorithm to different types of images. Experimental results
show that the proposed algorithm performs well on different types of
infrared and visual images. Therefore, the proposed algorithm could be
used in applications related to infrared and visual image fusion, such
as target detection, object recognition, and image navigation. However,
in night-time scenes, visual images may not contain many useful image
details, which will result in an unclear image fusion result.
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